[Gemma x Gemini CLI] Add an Experimental Gemma Router that uses a LiteRT-LM shim into the Composite Model Classifier Strategy #17231
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes

Hello @sidwan02, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request integrates an experimental Gemma Router into the CLI's model routing mechanism. The primary goal is to enhance the system's ability to intelligently select the most suitable large language model (either a 'flash' or 'pro' model) for a given user request, based on its perceived complexity. This is achieved by leveraging a local Gemini-compatible API, allowing for more efficient and context-aware model dispatching without relying solely on external services.

Highlights
Force-pushed from 0ed6b9a to 26900c9.
Code Review
This pull request introduces an experimental Gemma-based model router, using a LocalGeminiClient to communicate with a local Gemini-compatible API and a GemmaClassifierStrategy to route user requests. A high-severity security issue was identified in LocalGeminiClient: sensitive LLM interaction data (prompts and responses) is unconditionally written to a local file (router.txt), posing a significant data-leakage risk that requires remediation. The review also flagged a critical issue in how chat history is processed for the classifier, which could hurt routing accuracy, and another high-severity issue in LocalGeminiClient concerning incorrect role mapping.
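The role-mapping concern flagged above can be illustrated with a small sketch. Note this is a hypothetical reconstruction, not the PR's actual code: the function name `mapRole` and the treatment of an `assistant` alias are assumptions about how Gemini-style `user`/`model` roles might be normalized for the local shim.

```typescript
// Hypothetical sketch: normalizing chat-history roles before sending
// them to a local Gemini-compatible shim. Unknown roles fail loudly
// instead of being passed through silently, which the review flags as
// a source of skewed classifier context.
type GeminiRole = 'user' | 'model';

function mapRole(role: string): GeminiRole {
  switch (role) {
    case 'user':
      return 'user';
    case 'model':
    case 'assistant': // some clients emit 'assistant'; normalize it
      return 'model';
    default:
      // Throwing surfaces the mismatch immediately rather than
      // mislabeling turns in the classifier's input history.
      throw new Error(`Unsupported chat role: ${role}`);
  }
}
```

Failing fast on an unexpected role trades a hard error for the silent misclassification the review warns about, which is usually the right call for a routing component.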
Hi there! Thank you for your contribution to Gemini CLI. To improve our contribution process and better track changes, we now require all pull requests to be associated with an existing issue, as announced in our recent discussion and as detailed in our CONTRIBUTING.md. This pull request is being closed because it is not currently linked to an issue. You can easily reopen this PR once you have linked it to an issue. How to link an issue: Thank you for your understanding and for being a part of our community!
Go ahead and rebase, resolve conflicts, and work through the bot feedback; then I'll take a deeper look and run the tests.
/gemini review
Code Review
This pull request introduces an experimental Gemma router that uses a LiteRT-LM shim for model classification. A high-severity security issue was identified in the LocalGeminiClient class, where sensitive chat history is unconditionally written to a local file, posing a significant privacy risk. A related critical issue is a debug feature that writes files into the user's workspace, and a high-severity issue in the new client's role handling could lead to silent failures.
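One common remediation for the unconditional debug-file write described above is to gate logging behind an explicit opt-in and keep the file out of the user's workspace. The sketch below is a hypothetical illustration of that pattern; the environment variable name `GEMINI_ROUTER_DEBUG_LOG` and the file name are assumptions, not names from the PR.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical remediation sketch: persist router prompts/responses
// only when the user explicitly opts in, and write to the OS temp
// directory rather than the workspace. Returns the log path when a
// write happened, otherwise undefined.
function logRouterDebug(entry: string): string | undefined {
  if (process.env['GEMINI_ROUTER_DEBUG_LOG'] !== '1') {
    return undefined; // default: never persist sensitive chat data
  }
  const file = path.join(os.tmpdir(), 'gemini-router-debug.txt');
  fs.appendFileSync(file, entry + '\n');
  return file;
}
```

Making "no logging" the default addresses both findings at once: nothing sensitive is written unless asked for, and nothing lands in the repository where it could be committed by accident.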
Force-pushed from ea53d90 to 9da39ce.
…assifier within composite strategy.
…g port is provided.
…l + fix router tests.
Thanks Allen!
…ing no-api-key for LiteRT-LM shim
…andle abortsignal and removed jsonrepair
Force-pushed from 02f056c to fca16d8.
allenhutchison left a comment
Thank you for addressing the comments I left last time. I think this is ready to go in. I did find one bug when using Vertex instead of vanilla API keys; I shared the details offline.
9b7852f
…eRT-LM shim into the Composite Model Classifier Strategy (google-gemini#17231) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Allen Hutchison <adh@google.com>
Summary
This pull request introduces an experimental Gemma-based model routing strategy, including a new client for a local LiteRT-LM shim and its integration into the model router service.
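The flash/pro dispatch this PR describes can be sketched roughly as follows. This is a hypothetical outline of the strategy's shape only: the `Classifier` interface, the `simple`/`complex` labels, the model names, and the fallback behavior are all assumptions, not the PR's actual implementation.

```typescript
// Hypothetical sketch of a complexity-based routing strategy: a local
// classifier (e.g. Gemma behind a LiteRT-LM shim) labels the request,
// and the router picks a cheap or capable model accordingly.
type Complexity = 'simple' | 'complex';

interface Classifier {
  classify(prompt: string): Promise<Complexity>;
}

async function routeModel(
  prompt: string,
  classifier: Classifier,
): Promise<string> {
  try {
    const label = await classifier.classify(prompt);
    return label === 'complex' ? 'gemini-2.5-pro' : 'gemini-2.5-flash';
  } catch {
    // Composite strategies typically fall back to a default model when
    // the local classifier is unavailable, so routing never blocks.
    return 'gemini-2.5-flash';
  }
}
```

The try/catch reflects why a composite strategy is attractive here: the local classifier is an optimization, and its failure should degrade to a sensible default rather than break the request.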
Details
Related Issues
Related to https://github.com/google-gemini/maintainers-gemini-cli/issues/1222
How to Validate
Please follow the instructions in go/gcli-litertlm-setup-pr
A separate PR with instructions will be merged shortly after this one.
Pre-Merge Checklist